Skip to content

feat(monitor): add first-class background monitor module#3382

Merged
senamakel merged 6 commits into
tinyhumansai:mainfrom
senamakel:issue/3371-add-a-first-class-monitor-module-for-bac
Jun 4, 2026
Merged

feat(monitor): add first-class background monitor module#3382
senamakel merged 6 commits into
tinyhumansai:mainfrom
senamakel:issue/3371-add-a-first-class-monitor-module-for-bac

Conversation

@senamakel

@senamakel senamakel commented Jun 4, 2026

Copy link
Copy Markdown
Member

Summary

  • Adds a first-class src/openhuman/monitor/ domain for background command monitors.
  • Registers monitor, monitor_list, monitor_stop, and monitor_read through the agent tool facade.
  • Exposes openhuman.monitor_{start,list,stop,read} controller/RPC methods and monitor lifecycle events.
  • Streams stdout/stderr lines into bounded workspace output while queuing concise collect-context into active agent turns.
  • Updates About App and the coverage matrix for the new background monitor capability.

Problem

  • Agents can run shell commands and manually poll logs/status, but there was no core-owned background event source for long-running monitoring workflows.
  • A monitor implementation buried under generic system tools would bypass the repo's domain ownership pattern and make lifecycle/state hard to expose consistently.

Solution

  • Introduce openhuman::monitor with domain-owned types, store, runner, ops, schemas, tools, and README documentation.
  • Reuse shell-equivalent security expectations: command classification, gated-command checks, execute permission, rate limiting, action directory, safe environment, and audit logging.
  • Keep raw output bounded under <workspace>/monitor/*.log; keep recent structured events in the monitor store for list/read surfaces.
  • Extend parent execution context with the current run queue so monitor lines started inside an active turn are injected as collect at existing safe iteration boundaries.

Submission Checklist

If a section does not apply to this change, mark the item as N/A with a one-line reason. Do not delete items.

  • Tests added or updated (happy path + at least one failure / edge case) per Testing Strategy
  • Diff coverage ≥ 80% — changed lines (Vitest + cargo-llvm-cov merged via diff-cover) meet the gate enforced by .github/workflows/pr-ci.yml. Run pnpm test:coverage and pnpm test:rust locally; PRs below 80% on changed lines will not merge. Local full diff-cover not run; focused changed-line coverage is covered by Rust unit/RI tests and CI will enforce the merged gate.
  • Coverage matrix updated — added/removed/renamed feature rows in docs/TEST-COVERAGE-MATRIX.md reflect this change (or N/A: behaviour-only change)
  • All affected feature IDs from the matrix are listed in the PR description under ## Related
  • No new external network dependencies introduced (mock backend used per Testing Strategy)
  • Manual smoke checklist updated if this touches release-cut surfaces (docs/RELEASE-MANUAL-SMOKE.md) — N/A: core agent/tool surface only, no release-cut manual smoke flow changed.
  • Linked issue closed via Closes #NNN in the ## Related section

Impact

  • Runtime/platform impact: Rust core and controller/tool registry only; no React/Tauri UI files changed.
  • Security: monitor commands follow execute-class shell policy checks and use a safe environment allowlist; raw output is bounded and stored outside model history.
  • Compatibility: existing shell/tool behavior is preserved; run-queue injection is additive and only used when a monitor is started inside an active parent context.

Related


AI Authored PR Metadata (required for Codex/Linear PRs)

Keep this section for AI-authored PRs. For human-only PRs, mark each field N/A.

Linear Issue

  • Key: N/A
  • URL: N/A

Commit & Branch

  • Branch: issue/3371-add-a-first-class-monitor-module-for-bac
  • Commit SHA: 6081ec6c7

Validation Run

  • pnpm --filter openhuman-app format:check — passed via pre-push hook.
  • pnpm typecheck — passed via pre-push hook (pnpm compile).
  • Focused tests:
    • cargo test --manifest-path Cargo.toml monitor:: --lib
    • cargo test --manifest-path Cargo.toml monitor::ops::tests --lib
    • cargo test --manifest-path Cargo.toml --test monitor_agent_e2e
    • cargo test --manifest-path Cargo.toml --test calendar_grounding_e2e --no-run
    • cargo test --manifest-path Cargo.toml --tests --no-run
    • cargo test --manifest-path Cargo.toml run_queue --lib
    • cargo test --manifest-path Cargo.toml --test json_rpc_e2e json_rpc_monitor_list_and_read_surface
    • cargo test --manifest-path Cargo.toml all_tools_default_registry_contains_expected_baseline_surface --lib
  • Rust fmt/check (if changed): cargo fmt --manifest-path Cargo.toml; cargo check --manifest-path Cargo.toml; pre-push pnpm rust:check passed.
  • Tauri fmt/check (if changed): pre-push cargo fmt --manifest-path app/src-tauri/Cargo.toml --all --check and cargo check --manifest-path app/src-tauri/Cargo.toml passed.

Validation Blocked

  • command: N/A
  • error: N/A
  • impact: N/A

Behavior Changes

  • Intended behavior change: agents can start, list, stop, and read bounded background command monitors from a first-class core module.
  • User-visible effect: monitor output can be collected into active agent turns and inspected through RPC/tool surfaces.

Parity Contract

  • Legacy behavior preserved: existing shell tool and run queue behavior remain unchanged unless a monitor is explicitly started.
  • Guard/fallback/dispatch parity checks: monitor command gating mirrors shell command classification and policy checks; controller registry and tool facade tests cover exposure.

Duplicate / Superseded PR Handling

  • Duplicate PR(s): N/A
  • Canonical PR: this PR
  • Resolution (closed/superseded/updated): N/A

Summary by CodeRabbit

  • New Features
    • Added Background Monitors functionality enabling users to start, list, stop, and read background command monitors with bounded output storage and lifecycle event publishing.
    • Monitor operations now available via JSON-RPC interface with security policy enforcement and resource limits.

@senamakel senamakel requested a review from a team June 4, 2026 15:41
@coderabbitai

coderabbitai Bot commented Jun 4, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

📝 Walkthrough

Walkthrough

Introduces a first-class openhuman::monitor module for background command monitoring: defines monitor types and events, implements an in-memory store and subprocess runner with bounded output/event injection, provides RPC handlers and tool wrappers, extends the domain event system, propagates run-queue context to sub-agents, and integrates everything into the core controller registry.

Changes

Background Monitor Domain

Layer / File(s) Summary
Monitor types and event system integration
src/openhuman/monitor/types.rs, src/core/event_bus/events.rs
Defines monitor data models (status/stream enums, request/response structs, snapshots with recent events); extends DomainEvent with MonitorStatusChanged and MonitorLine variants for lifecycle/output publishing.
Store and lifecycle management
src/openhuman/monitor/store.rs
Implements process-wide in-memory registry tracking active monitors (snapshots + stop signals), with methods to insert/list/get monitors, push events, update status, and stop monitors.
Subprocess execution and output streaming
src/openhuman/monitor/runner.rs
Executes monitored shell commands with environment sanitization, line-buffered stdout/stderr reading, timeout/stop handling, bounded log writing, event emission, and run-queue message injection.
RPC operation handlers
src/openhuman/monitor/ops.rs
Orchestrates monitor lifecycle (start, start_default, list, stop, read): validates commands via security policy, manages output directories, spawns runners, and retrieves bounded output with truncation metadata.
Schema definitions and RPC registration
src/openhuman/monitor/schemas.rs
Defines monitor controller schemas and async handler functions, registers all controllers and schemas for RPC dispatch.
Agent-callable tool implementations
src/openhuman/monitor/tools.rs
Implements MonitorTool, MonitorListTool, MonitorStopTool, MonitorReadTool as Tool trait implementations with security gating and JSON result handling.
Agent run-queue context propagation
src/openhuman/agent/harness/fork_context.rs, src/openhuman/agent/harness/session/turn.rs, src/openhuman/agent/harness/subagent_runner/ops_tests.rs, src/openhuman/agent/triage/escalation.rs, src/openhuman/agent_orchestration/*
Adds optional run_queue field to ParentExecutionContext and propagates it through sub-agent spawn sites for event injection coordination.
Core integration and module exposure
src/openhuman/mod.rs, src/core/all.rs, src/openhuman/tools/mod.rs, src/openhuman/tools/ops.rs, src/openhuman/tools/ops_tests.rs, src/openhuman/about_app/catalog_data.rs
Declares monitor module in openhuman, registers monitor controllers/schemas in dispatcher, re-exports monitor tools, adds background_monitors capability, and updates baseline tool registry assertions.
Module documentation and test coverage
src/openhuman/monitor/README.md, src/openhuman/monitor/mod.rs, tests/json_rpc_e2e.rs, docs/TEST-COVERAGE-MATRIX.md
Adds comprehensive module README describing ownership/security/event contracts, module entrypoint with public submodule re-exports, JSON-RPC E2E test for list/read surface, and test coverage matrix entry.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • tinyhumansai/openhuman#3050: Extends tool registry wiring in src/openhuman/tools/ops.rs and tools/mod.rs, similar infrastructure pattern to monitor tool exposure.
  • tinyhumansai/openhuman#3305: Modifies src/core/event_bus/events.rs domain routing and variant registration, same pattern as monitor event variants.

Suggested labels

feature, rust-core, agent

Suggested reviewers

  • graycyrus
  • sanil-23

Poem

🐰 A monitor stirs in the digital night,
Background commands running out of sight,
Bounded logs whisper their streaming song,
While events dance down the run-queue all day long,
No more polling—the future arrives in one bite! 🥕

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 59.76% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'feat(monitor): add first-class background monitor module' directly and clearly describes the main change—introducing a dedicated monitor module as a first-class domain, which is the primary objective of the PR.
Linked Issues check ✅ Passed All primary objectives from issue #3371 are met: first-class monitor module structure created at src/openhuman/monitor/ with types, store, runner/ops, schemas, tools, and documentation; agent-callable tools (monitor, monitor_list, monitor_stop, monitor_read) registered and exposed; shell-equivalent security enforced; background subprocess execution with line streaming, timeouts, and stop support; run-queue event injection via parent context; bounded output storage; and comprehensive tests validating security, streaming, timeout, stop, output bounding, and run-queue behavior.
Out of Scope Changes check ✅ Passed All changes are directly scoped to implementing the background monitor module and its integration: new monitor domain files, event system extensions, tool registry updates, parent context field additions for run-queue passing, test helper updates, and documentation additions. No unrelated refactoring or auxiliary changes detected.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot added feature Net-new user-facing capability or product behavior. rust-core Core Rust runtime in src/: CLI, core_server, shared infrastructure. agent Built-in agents, prompts, orchestration, and agent runtime in src/openhuman/agent/. labels Jun 4, 2026

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (4)
tests/json_rpc_e2e.rs (1)

1030-1033: ⚡ Quick win

Assert the JSON-RPC error.message field directly

Using error.to_string().contains("mon_missing") is brittle to envelope/serialization changes. Assert against error["message"] to pin the transport contract explicitly.

Suggested change
-    assert!(
-        error.to_string().contains("mon_missing"),
-        "missing monitor error should name id: {error}"
-    );
+    let msg = error
+        .get("message")
+        .and_then(Value::as_str)
+        .unwrap_or_default();
+    assert!(
+        msg.contains("mon_missing"),
+        "missing monitor error should name id in error.message: {error}"
+    );
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/json_rpc_e2e.rs` around lines 1030 - 1033, The current assertion checks
the error string via error.to_string(), which is brittle; instead index into the
JSON-RPC error object and assert on its "message" field directly (e.g., use
error["message"] or error.get("message") on the serde_json::Value named error)
to verify it contains or equals "mon_missing" so the transport contract is
pinned; update the assert!(...) that references error.to_string() to check the
JSON field of error (and adjust to handle Option/Type accordingly).
src/openhuman/monitor/runner.rs (1)

114-195: 💤 Low value

Reader tasks are not joined after child process exits.

After child.wait() completes and remaining lines are drained via try_recv, the spawned reader tasks (spawn_reader) are left detached. While they will eventually terminate when the pipe handles are closed, there's a brief window where they could still be trying to send on the closed channel.

This is unlikely to cause issues since tx.send() returns an error when the receiver is dropped (which is handled), but for completeness you could store the JoinHandles and await them after draining.

This is a minor robustness concern - the current implementation is functionally correct.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/monitor/runner.rs` around lines 114 - 195, The reader tasks
spawned by spawn_reader in run_child are not awaited after child.wait(), leaving
detached JoinHandles that may race sending to the channel; capture the
JoinHandle returned by each spawn_reader call (e.g. let stdout_handle =
spawn_reader(...), let stderr_handle = spawn_reader(...)), drop or close the
sender (line_tx) after draining remaining messages, then await those handles
(stdout_handle.await and stderr_handle.await) before returning from the
child.wait() branch so the reader tasks are properly joined and any errors are
observed.
src/openhuman/monitor/types.rs (1)

89-119: 💤 Low value

Inconsistent serde field naming between response types.

MonitorStartResponse, MonitorStopResponse, and MonitorReadResponse use camelCase renames (monitorId, outputFile) while MonitorSnapshot (used in MonitorListResponse) uses snake_case. This creates an inconsistent API surface where monitor_list returns snake_case fields but other endpoints return camelCase for the same conceptual fields.

Consider unifying: either apply #[serde(rename_all = "camelCase")] to all response types or keep everything snake_case.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/monitor/types.rs` around lines 89 - 119, The issue is
inconsistent serde naming across response types; unify to camelCase by applying
#[serde(rename_all = "camelCase")] to MonitorSnapshot and MonitorListResponse
(and any other structs in this module lacking it) so field names like
monitor_id/output_file are serialized the same as MonitorStartResponse,
MonitorStopResponse, and MonitorReadResponse; update or remove any per-field
#[serde(rename = "...")] annotations as needed so all structs
(MonitorStartResponse, MonitorStopResponse, MonitorReadResponse,
MonitorSnapshot, MonitorListResponse) consistently use camelCase.
src/openhuman/monitor/tools.rs (1)

61-71: ⚡ Quick win

Centralize monitor command classification to prevent gate drift.

This classification/escalation block duplicates src/openhuman/monitor/ops.rs (start), which can desync approval gating (external_effect_with_args) from execute-time enforcement over time. Extract a shared helper and reuse it in both places.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/monitor/tools.rs` around lines 61 - 71, The command
classification/escalation logic in external_effect_with_args (using
security.classify_command and optional SecurityPolicy::parse_declared_class then
security.gate_decision) is duplicated in src/openhuman/monitor/ops.rs (start);
extract that logic into a single helper (e.g., classify_and_escalate_command or
classify_command_with_declared) in monitor::tools, have the helper accept
(&self, command: &str, declared_category: Option<&str>) and return the final
GateDecision or resulting Class, then replace the inline code in
external_effect_with_args and the start function in ops.rs to call the new
helper so both use the same classification/escalation path.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/openhuman/monitor/ops.rs`:
- Around line 12-170: Add debug/trace logging to the monitor ops: instrument the
functions start, start_default, list, stop, and read to emit stable tracing/log
events at key points (entry, important branch decisions, external calls, error
returns, and state transitions). For start, log the incoming command, computed
class, gate decision, rate-limit decision, generated monitor_id, thread_id and
session_id, and the start result; for start_default log config load
success/failure and the created SecurityPolicy; for list log the count of
monitors returned; for stop and read log the requested monitor_id and the
resulting snapshot or output path and any errors. Ensure logs include
correlation fields monitor_id, thread_id, session_id where available and use
debug/trace level via the project's tracing/log macros.

In `@src/openhuman/monitor/tools.rs`:
- Around line 74-87: Add stable-prefixed debug/trace logs in execute around
parsing, the delegated ops::start call, and both success/error branches: log
before serde_json::from_value with the incoming args and a correlation id if
present, log immediately before calling ops::start including cloned handles
(self.security, self.runtime, self.audit) and any request identifiers, log on
Ok(outcome) with the outcome.value and exit marker, and log on Err(error)
including error details and exit marker; ensure logs use a consistent
grep-friendly prefix (e.g., "monitor:tool:execute:") and appropriate levels
(debug/trace for entry/exit and branches, error for failures) so the
MonitorStartRequest -> ops::start -> ToolResult flow is fully instrumented.

---

Nitpick comments:
In `@src/openhuman/monitor/runner.rs`:
- Around line 114-195: The reader tasks spawned by spawn_reader in run_child are
not awaited after child.wait(), leaving detached JoinHandles that may race
sending to the channel; capture the JoinHandle returned by each spawn_reader
call (e.g. let stdout_handle = spawn_reader(...), let stderr_handle =
spawn_reader(...)), drop or close the sender (line_tx) after draining remaining
messages, then await those handles (stdout_handle.await and stderr_handle.await)
before returning from the child.wait() branch so the reader tasks are properly
joined and any errors are observed.

In `@src/openhuman/monitor/tools.rs`:
- Around line 61-71: The command classification/escalation logic in
external_effect_with_args (using security.classify_command and optional
SecurityPolicy::parse_declared_class then security.gate_decision) is duplicated
in src/openhuman/monitor/ops.rs (start); extract that logic into a single helper
(e.g., classify_and_escalate_command or classify_command_with_declared) in
monitor::tools, have the helper accept (&self, command: &str, declared_category:
Option<&str>) and return the final GateDecision or resulting Class, then replace
the inline code in external_effect_with_args and the start function in ops.rs to
call the new helper so both use the same classification/escalation path.

In `@src/openhuman/monitor/types.rs`:
- Around line 89-119: The issue is inconsistent serde naming across response
types; unify to camelCase by applying #[serde(rename_all = "camelCase")] to
MonitorSnapshot and MonitorListResponse (and any other structs in this module
lacking it) so field names like monitor_id/output_file are serialized the same
as MonitorStartResponse, MonitorStopResponse, and MonitorReadResponse; update or
remove any per-field #[serde(rename = "...")] annotations as needed so all
structs (MonitorStartResponse, MonitorStopResponse, MonitorReadResponse,
MonitorSnapshot, MonitorListResponse) consistently use camelCase.

In `@tests/json_rpc_e2e.rs`:
- Around line 1030-1033: The current assertion checks the error string via
error.to_string(), which is brittle; instead index into the JSON-RPC error
object and assert on its "message" field directly (e.g., use error["message"] or
error.get("message") on the serde_json::Value named error) to verify it contains
or equals "mon_missing" so the transport contract is pinned; update the
assert!(...) that references error.to_string() to check the JSON field of error
(and adjust to handle Option/Type accordingly).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: bb62e931-e7ae-4330-a4bf-4225c45f9f8f

📥 Commits

Reviewing files that changed from the base of the PR and between ce3ac82 and b1f8d97.

📒 Files selected for processing (25)
  • docs/TEST-COVERAGE-MATRIX.md
  • src/core/all.rs
  • src/core/event_bus/events.rs
  • src/openhuman/about_app/catalog_data.rs
  • src/openhuman/agent/harness/fork_context.rs
  • src/openhuman/agent/harness/session/turn.rs
  • src/openhuman/agent/harness/subagent_runner/ops_tests.rs
  • src/openhuman/agent/triage/escalation.rs
  • src/openhuman/agent_orchestration/ops_tests.rs
  • src/openhuman/agent_orchestration/tools/spawn_parallel_agents_tests.rs
  • src/openhuman/agent_orchestration/tools/spawn_worker_thread.rs
  • src/openhuman/agent_orchestration/tools/tools_e2e_tests.rs
  • src/openhuman/mod.rs
  • src/openhuman/monitor/README.md
  • src/openhuman/monitor/mod.rs
  • src/openhuman/monitor/ops.rs
  • src/openhuman/monitor/runner.rs
  • src/openhuman/monitor/schemas.rs
  • src/openhuman/monitor/store.rs
  • src/openhuman/monitor/tools.rs
  • src/openhuman/monitor/types.rs
  • src/openhuman/tools/mod.rs
  • src/openhuman/tools/ops.rs
  • src/openhuman/tools/ops_tests.rs
  • tests/json_rpc_e2e.rs

Comment thread src/openhuman/monitor/ops.rs
Comment thread src/openhuman/monitor/tools.rs
@senamakel senamakel merged commit f514bec into tinyhumansai:main Jun 4, 2026
19 of 22 checks passed
senamakel added a commit to senamakel/openhuman that referenced this pull request Jun 6, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agent Built-in agents, prompts, orchestration, and agent runtime in src/openhuman/agent/. feature Net-new user-facing capability or product behavior. rust-core Core Rust runtime in src/: CLI, core_server, shared infrastructure.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add a first-class monitor module for background agent events

1 participant